Cross-Language Personal Name Mapping
Abstract
Name matching across natural languages is an important step in cross-enterprise integration applications and data mining. It is difficult to decide whether two syntactic values (names) from two heterogeneous data sources are alternative designations of the same semantic entity (person). The task is harder still for Arabic because of several factors, including spelling and pronunciation variation, dialects, special vowel and consonant distinctions, and other linguistic characteristics. This paper proposes a new framework for name matching between Arabic and other languages. The framework uses a dictionary based on a newly proposed version of the Soundex algorithm to encapsulate the recognition of special features of Arabic names. It also proposes a new proximity matching algorithm suited to the high importance of order sensitivity in Arabic name matching. New performance evaluation metrics are proposed as well. The framework is implemented and verified empirically in several case studies, which demonstrate substantial improvements over other well-known techniques in the literature.
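For orientation, the baseline that the abstract builds on can be sketched as follows. This is the classic American Soundex code applied to Latin-script transliterations; it is not the modified version the paper proposes, whose details are not given in the abstract. The function name `soundex` and the choice to ignore non-ASCII characters are assumptions made for illustration only.

```python
# Classic American Soundex (baseline only, NOT the paper's proposed variant).
# Assumption: input is a Latin-script transliteration; other characters are ignored.

CODES = {}
for letters, digit in (
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name, or "" if it has no letters."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""

    first = letters[0]
    result = first.upper()
    prev = CODES.get(first, "")  # the first letter's code still blocks an adjacent duplicate

    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel sets prev to "", so a repeated code after it is kept

    return result.ljust(4, "0")


if __name__ == "__main__":
    for variant in ("Mohammed", "Muhammad", "Mohamed"):
        print(variant, soundex(variant))
```

Under this baseline the three spellings above all map to `M530`, which is the kind of variant grouping a Soundex-keyed dictionary relies on. Variants that differ in their first letter (for example, `Yousef` and `Joseph`) receive different codes, one reason a plain Soundex is a weak fit for cross-language matching.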
Similar resources
The Effect of Story Mapping on Writing Performance of Iranian EFL Learners
Although story mapping strategy has been shown to be beneficial in many reading comprehension classes, the benefits of this technique have not been thoroughly investigated in L2 writing research. The small number of previous studies (e.g., Li, 2007; Brunner, 2010) have found the potential benefits of using story mapping strategy on students’ writing performance, but they did not focus on differ...
Towards automatic cross-lingual acoustic modelling applied to HMM-based speech synthesis for under-resourced languages
Nowadays Human Computer Interaction (HCI) can also be achieved with voice user interfaces (VUIs). To enable devices to communicate with humans by speech in the user’s own language, low-cost language portability is often discussed and analysed. One of the most time-consuming parts of the language-adaptation process for VUI-capable applications is the target-language speech-data acquisition. Such ...
Learning Phoneme Mappings for Transliteration without Parallel Data
We present a method for performing machine transliteration without any parallel resources. We frame the transliteration task as a decipherment problem and show that it is possible to learn cross-language phoneme mapping tables using only monolingual resources. We compare various methods and evaluate their accuracies on a standard name transliteration task.
Mapping it differently: A solution to the linking challenges
This paper reports the work of creating bilingual mappings in English for certain synsets of Hindi wordnet, the need for doing this, the methods adopted and the tools created for the task. Hindi wordnet, which forms the foundation for other Indian language wordnets, has been linked to the English WordNet. To maximize linkages, an important strategy of using direct and hypernymy linkages has bee...
Journal title: CoRR
Volume: abs/1405.6293
Year of publication: 2014